AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Neural Information Processing SystemsJan-19-2025, 18:34:40 GMT

Correlation Aware Sparsified Mean Estimation Using Random Projection

correlation aware sparsified mean estimation, correlation information, rand-proj-spatial, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Neural Information Processing SystemsMar-15-2024, 10:00:18 GMT

EigenNet: A Bayesian hybrid of generative and conditional models for sparse learning Yuan Qi

For many real-world applications, we often need to select correlated variables-- such as genetic variations and imaging features associated with Alzheimer's disease--in a high dimensional space. The correlation between variables presents a challenge to classical variable selection methods. To address this challenge, the elastic net has been developed and successfully applied to many applications. Despite its great success, the elastic net does not exploit the correlation information embedded in the data to select correlated variables. To overcome this limitation, we present a novel hybrid model, EigenNet, that uses the eigenstructures of data to guide variable selection.

eigennet, eigenvector, elastic net, (15 more...)

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Zhang, Zishi, Peng, Yijie

Sample-Efficient Clustering and Conquer Procedures for Parallel Large-Scale Ranking and Selection

arXiv.org Artificial IntelligenceFeb-3-2024

We propose novel "clustering and conquer" procedures for the parallel large-scale ranking and selection (R&S) problem, which leverage correlation information for clustering to break the bottleneck of sample efficiency. In parallel computing environments, correlation-based clustering can achieve an $\mathcal{O}(p)$ sample complexity reduction rate, which is the optimal reduction rate theoretically attainable. Our proposed framework is versatile, allowing for seamless integration of various prevalent R&S methods under both fixed-budget and fixed-precision paradigms. It can achieve improvements without the necessity of highly accurate correlation estimation and precise clustering. In large-scale AI applications such as neural architecture search, a screening-free version of our procedure surprisingly surpasses fully-sequential benchmarks in terms of sample efficiency. This suggests that leveraging valuable structural information, such as correlation, is a viable path to bypassing the traditional need for screening via pairwise comparison--a step previously deemed essential for high sample efficiency but problematic for parallelization. Additionally, we propose a parallel few-shot clustering algorithm tailored for large-scale problems.

correlation-based clustering, parallel correlation-based clustering, zhang and peng, (12 more...)

2402.02196

Country:

Asia > China > Hunan Province > Changsha (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Beijing > Beijing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)

arXiv.org Artificial IntelligenceMay-14-2023

Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation

Zhang, Songming, Liang, Yunlong, Wang, Shuaibo, Han, Wenjuan, Liu, Jian, Xu, Jinan, Chen, Yufeng

Knowledge distillation (KD) is a promising technique for model compression in neural machine translation. However, where the knowledge hides in KD is still not clear, which may hinder the development of KD. In this work, we first unravel this mystery from an empirical perspective and show that the knowledge comes from the top-1 predictions of teachers, which also helps us build a potential connection between word- and sequence-level KD. Further, we point out two inherent issues in vanilla word-level KD based on this finding. Firstly, the current objective of KD spreads its focus to whole distributions to learn the knowledge, yet lacks special treatment on the most crucial top-1 information. Secondly, the knowledge is largely covered by the golden information due to the fact that most top-1 predictions of teachers overlap with ground-truth tokens, which further restricts the potential of KD. To address these issues, we propose a novel method named \textbf{T}op-1 \textbf{I}nformation \textbf{E}nhanced \textbf{K}nowledge \textbf{D}istillation (TIE-KD). Specifically, we design a hierarchical ranking loss to enforce the learning of the top-1 information from the teacher. Additionally, we develop an iterative KD procedure to infuse more additional knowledge by distilling on the data without ground-truth targets. Experiments on WMT'14 English-German, WMT'14 English-French and WMT'16 English-Romanian demonstrate that our method can respectively boost Transformer$_{base}$ students by +1.04, +0.60 and +1.11 BLEU scores and significantly outperform the vanilla word-level KD baseline. Besides, our method shows higher generalizability on different teacher-student capacity gaps than existing KD techniques.

computational linguistic, information, word-level kd, (15 more...)

2305.08096

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > Dominican Republic (0.04)
(4 more...)

Genre:

Research Report > Promising Solution (0.68)
Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Artificial IntelligenceMar-10-2023

Zero-Shot Object Searching Using Large-scale Object Relationship Prior

Chen, Hongyi, Xu, Ruinian, Cheng, Shuo, Vela, Patricio A., Xu, Danfei

Home-assistant robots have been a long-standing research topic, and one of the biggest challenges is searching for required objects in housing environments. Previous object-goal navigation requires the robot to search for a target object category in an unexplored environment, which may not be suitable for home-assistant robots that typically have some level of semantic knowledge of the environment, such as the location of static furniture. In our approach, we leverage this knowledge and the fact that a target object may be located close to its related objects for efficient navigation. To achieve this, we train a graph neural network using the Visual Genome dataset to learn the object co-occurrence relationships and formulate the searching process as iteratively predicting the possible areas where the target object may be located. This approach is entirely zero-shot, meaning it doesn't require new accurate object correlation in the test environment. We empirically show that our method outperforms prior correlational object search algorithms. As our ultimate goal is to build fully autonomous assistant robots for everyday use, we further integrate the task planner for parsing natural language and generating task-completing plans with object navigation to execute human instructions. We demonstrate the effectiveness of our proposed pipeline in both the AI2-THOR simulator and a Stretch robot in a real-world environment.

large language model, machine learning, natural language, (18 more...)

2303.06228

Country: North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

arXiv.org Artificial IntelligenceJul-16-2022

A Correlation Information-based Spatiotemporal Network for Traffic Flow Forecasting

Zhu, Weiguo, Sun, Yongqi, Yi, Xintong, Wang, Yan

The technology of traffic flow forecasting plays an important role in intelligent transportation systems. Based on graph neural networks and attention mechanisms, most previous works utilize the transformer architecture to discover spatiotemporal dependencies and dynamic relationships. However, they have not considered correlation information among spatiotemporal sequences thoroughly. In this paper, based on the maximal information coefficient, we present two elaborate spatiotemporal representations, spatial correlation information (SCorr) and temporal correlation information (TCorr). Using SCorr, we propose a correlation information-based spatiotemporal network (CorrSTN) that includes a dynamic graph neural network component for integrating correlation information into spatial structure effectively and a multi-head attention component for modeling dynamic temporal dependencies accurately. Utilizing TCorr, we explore the correlation pattern among different periodic data to identify the most relevant data, and then design an efficient data selection scheme to further enhance model performance. The experimental results on the highway traffic flow (PEMS07 and PEMS08) and metro crowd flow (HZME inflow and outflow) datasets demonstrate that CorrSTN outperforms the state-of-the-art methods in terms of predictive performance. In particular, on the HZME (outflow) dataset, our model makes significant improvements compared with the ASTGNN model by 12.7%, 14.4% and 27.4% in the metrics of MAE, RMSE and MAPE, respectively.

artificial intelligence, deep learning, machine learning, (18 more...)

2205.10365

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > California (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.48)

Industry:

Consumer Products & Services > Travel (0.83)
Transportation > Ground > Road (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-10-2021

Aligning Correlation Information for Domain Adaptation in Action Recognition

Xu, Yuecong, Yang, Jianfei, Cao, Haozhi, Mao, Kezhi, Yin, Jianxiong, See, Simon

Domain adaptation (DA) approaches address domain shift and enable networks to be applied to different scenarios. Although various image DA approaches have been proposed in recent years, there is limited research towards video DA. This is partly due to the complexity in adapting the different modalities of features in videos, which includes the correlation features extracted as long-term dependencies of pixels across spatiotemporal dimensions. The correlation features are highly associated with action classes and proven their effectiveness in accurate video feature extraction through the supervised action recognition task. Yet correlation features of the same action would differ across domains due to domain shift. Therefore we propose a novel Adversarial Correlation Adaptation Network (ACAN) to align action videos by aligning pixel correlations. ACAN aims to minimize the distribution of correlation information, termed as Pixel Correlation Discrepancy (PCD). Additionally, video DA research is also limited by the lack of cross-domain video datasets with larger domain shifts. We, therefore, introduce a novel HMDB-ARID dataset with a larger domain shift caused by a larger statistical difference between domains. This dataset is built in an effort to leverage current datasets for dark video classification. Empirical results demonstrate the state-of-the-art performance of our proposed ACAN for both existing and the new video DA datasets.

correlation feature, dataset, recognition, (11 more...)

2107.04932

Country:

Asia > Singapore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsDec-31-2011

EigenNet: A Bayesian hybrid of generative and conditional models for sparse learning

Yan, Feng, Qi, Yuan

For many real-world applications, we often need to select correlated variables---such as genetic variations and imaging features associated with Alzheimer's disease---in a high dimensional space. The correlation between variables presents a challenge to classical variable selection methods. To address this challenge, the elastic net has been developed and successfully applied to many applications. Despite its great success, the elastic net does not exploit the correlation information embedded in the data to select correlated variables. To overcome this limitation, we present a novel hybrid model, EigenNet, that uses the eigenstructures of data to guide variable selection. Specifically, it integrates a sparse conditional classification model with a generative model capturing variable correlations in a principled Bayesian framework. We develop an efficient active-set algorithm to estimate the model via evidence maximization. Experiments on synthetic data and imaging genetics data demonstrated the superior predictive performance of the EigenNet over the lasso, the elastic net, and the automatic relevance determination.

artificial intelligence, eigennet, machine learning, (19 more...)

Country: North America > United States > Indiana > Tippecanoe County (0.14)

Genre: Research Report (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)